10/23/2021

Index.

1.- Executive Summary.

2.- Regression AAPL vs S&P500.

3.- AAPL Close vs SP500 & NASDAQ100.

4.- Months, days and trading sessions.

5.- Regression AAPLre vs SP500re.

6.- Betas daily rolling 3,5,7,10 years.

7.- Beta summary.

8.- Look ahead: Building a Forecast model.

Index (Contd…)

9.- Model Structure.

10.-Regression Model: dataset.

11.-Time-related predictors.

12.-Market-related predictors.

13.-Discount rates-related predictors.

14.-Model Output: ANOVA.

15.-Model Output: Coefficients.

16.-But there is a concern: multicollinearity.

Index (Contd…)

17.-Predictors correlations.

18.-Residuals Plot.

19.-Root Mean Squared Error (RMSE).

20.-Apple Close (Forecast).

21.-Risk: Probability of permanent loss.

22.-Risk goes down with longer hold.

Executive Summary.

This presentation summarizes the work done to find a model that represents and predicts Apple’s share price (ticker AAPL) at the close of every session, and that forecasts prices a number of sessions forward (in this case 21 sessions, about a month of trading).

The data set covers trading sessions from January 1990 to October 2021. The model is built so that the data set updates itself (through R code that pulls data from Yahoo Finance), the model is refitted with the new data, and a forecast is produced 21 sessions forward.

The last update for the purpose of this document was performed on October 21, 2021.

Executive Summary (Contd…)

The closing of AAPL is represented by a linear regression model which includes predictors such as sequential days of trading, year, month, day, the closing level of the S&P500 and the NASDAQ100, 5 year, 10 year, and 30 year Treasury prices, and the dollar index.

The relationship between the predictors and the AAPL shares is not always linear. The predictors are hence transformed to reflect this, and then lagged 21 periods (trading sessions) to forecast forward closings.

There are signs of collinearity (predictors highly correlated with each other) in the model. This needs further research and is an opportunity for improvement, but it is beyond the scope of this work for the time being.

Executive Summary (Contd…)

The model predicts the AAPL close with a root mean squared error (RMSE) of 1.40 USD/share on the training set and 10.35 USD/share on the testing set. A total of 87% of the actual data points in the testing set are within 1 standard error (10.35 USD/share) of the prediction.

Apple’s closing on October 21 was 149.48 USD/share (Yahoo Finance). The model predicts 151.89 USD/share for that day.

The model forecasts AAPL to close on October 22, 2021 (next trading session) at 156.19 USD/share. AAPL actually closed at 148.69 USD/share (Yahoo Finance).

The forecast for October 25 was 156.36 USD/share; AAPL closed at 148.64 USD/share (Yahoo Finance).

Executive Summary (Contd…)

All this may be explored on the graphs. Beta coefficients versus the S&P500 are also derived and explored. The 5-year rolling Beta to the S&P500 reported by the model is 1.21 (versus 1.23 reported by Yahoo Finance). Beta is often interpreted as a measure of volatility and risk. The evolution of Beta over 3, 5, 7 and 10 years may be explored interactively on the graphs.

This work also suggests a different way to see and measure risk. I suggest that risk be understood as the probability of permanent loss. That requires a buy-hold-sell strategy within a window of time. The windows considered are 1, 5, 10, 20, 50, 100, 200 (approx. 1 year), 400, 1000, 2000 (10 years) and 4000 trading sessions. The wider the time window, the lower the probability of permanent loss, and hence the lower the risk.

Regression AAPL vs S&P500.

lm(formula = AAPLClose ~ (. - date - Month - X.TYXClose - DX.Y.NYBClose - 
    X.DJIClose) + factor(Month, exclude = c(5, 11, 4, 12, 6, 
    7, 3, 2)), data = FullDF)
[1] "Residuals:"
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
-22.8786  -2.1529  -0.2152   0.0000   3.3252  14.5205 
[1] "Adjusted R squared:"
[1] 0.9746672
An adjusted R squared of 0.97 means that the model explains
 97.47 percent of the total variance.

Regression AAPL vs S&P500. (Contd…)

[1] "F-statistic:"
   value    numdf    dendf 
10442.96    10.00  2704.00 
[1] "P-value:"
value 
    0 
A high F-statistic and a low (zero) p-value mean that the model is significant.
 Collinearity may be present.

Regression AAPL vs S&P500. (Contd…)

                 Estimate   Std. Error    t value      Pr(>|t|)
(Intercept) -7.559765e+05 2.777576e+04 -27.217129 2.244391e-144
id          -1.510236e+00 5.549118e-02 -27.215785 2.309875e-144
Year         3.799028e+02 1.395800e+01  27.217575 2.223078e-144
Day          1.050600e+00 4.034714e-02  26.039021 1.442891e-133
SP500       -2.338573e-02 9.150650e-04 -25.556364 3.200437e-129
NDXClose     1.572176e-02 1.923123e-04  81.751207  0.000000e+00
X.TNXClose  -2.670087e+00 2.112474e-01 -12.639619  1.272616e-35
X.IRXClose  -1.289807e-01 1.159453e-01  -1.112427  2.660533e-01
Month8       2.226430e+02 8.163735e+00  27.272197 6.905232e-145
Month9       2.542560e+02 9.334650e+00  27.237871 1.439944e-144
Month10      2.865822e+02 1.051457e+01  27.255718 9.827244e-145
P-values are close to zero for every predictor except X.IRXClose (p = 0.27).
 This means the coefficients of the other predictors are significant; collinearity may be present.

Regression AAPL vs S&P500. (Contd…)

Analysis of Variance Table

Response: AAPLClose
             Df  Sum Sq Mean Sq   F value    Pr(>F)    
id            1 1328284 1328284 53545.136 < 2.2e-16 ***
Year          1     507     507    20.458 6.358e-06 ***
Day           1     146     146     5.883   0.01535 *  
SP500         1  939189  939189 37860.123 < 2.2e-16 ***
NDXClose      1  299840  299840 12086.987 < 2.2e-16 ***
X.TNXClose    1    2292    2292    92.391 < 2.2e-16 ***
X.IRXClose    1    1844    1844    74.339 < 2.2e-16 ***
Month(f)      3   18464    6155   248.105 < 2.2e-16 ***
Residuals  2704   67078      25                        
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

AAPL Close vs SP500 & NASDAQ100.

AAPL Close vs SP500 & NASDAQ100 (secondary y axis). The secondary y axis does not display properly in ggplotly; multiply the main y axis by 100 to read SP500 and NASDAQ100.

Months, days and trading sessions.

Before we look at monthly returns, let’s explore what “Monthly” means as an average long-run number of sessions in the market. The data frame has a total of:

[1] 8015

records. These are daily sessions of the S&P500. These sessions cover dates from:

[1] "1990-01-02"

excluding all Saturdays, all Sundays, and all trading holidays, until:

[1] "2021-10-21"

Months, days and trading sessions (Contd…).

The difference in months is:

[1] 381.6247

The average length of a month in this data set, in trading sessions, is:

[1] 21
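The 21-session figure follows from the two numbers shown above. A minimal R sketch, with the values copied from the output; the variable names are illustrative and are not taken from the original code:

```r
# Values copied from the output above; variable names are illustrative.
n_sessions <- 8015       # trading sessions in the data frame
n_months   <- 381.6247   # calendar months between the first and last session

# Average number of trading sessions per calendar month.
sessions_per_month <- n_sessions / n_months
round(sessions_per_month)
```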

AAPL Monthly %Returns vs Monthly %Returns SP500 & NASDAQ100

Regression AAPLre vs SP500re.

Call:
glm(formula = AAPLre ~ SP500re, data = returnsMonthlyDF)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-65.152   -6.215   -0.111    6.054  105.809  

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  1.49610    0.12935   11.57   <2e-16 ***
SP500re      1.23440    0.02854   43.26   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for gaussian family taken to be 130.1822)

    Null deviance: 1286723  on 8014  degrees of freedom
Residual deviance: 1043150  on 8013  degrees of freedom
AIC: 61774

Number of Fisher Scoring iterations: 2

Regression AAPLre vs SP500re (Contd…)

Beta AAPLre vs SP500re:
[1] 1.234399
Yahoo Finance Beta (5Y Monthly) = 1.22
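The Beta above is the slope of the returns regression. As a hedged illustration on synthetic data (not the AAPL and S&P500 returns used here), the sketch below shows that the regression slope equals the covariance of the two return series divided by the variance of the market returns:

```r
# Synthetic percent returns, for illustration only (not the real data).
set.seed(1)
market <- rnorm(500, mean = 1, sd = 4)
stock  <- 1.5 + 1.2 * market + rnorm(500, sd = 6)

# Beta as the slope of a simple linear regression.
fit     <- lm(stock ~ market)
beta_lm <- unname(coef(fit)["market"])

# Beta as covariance over variance.
beta_ratio <- cov(stock, market) / var(market)

all.equal(beta_lm, beta_ratio)   # the two definitions agree
```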

Apple Beta vs SP500 over daily rolling windows of 3, 5, 7 and 10 years:

    xids beta3Y beta5Y beta7Y beta10Y
346  346   1.25   1.21   1.26    1.21

Betas daily rolling 3,5,7,10 years.

AAPL Betas vs S&P500 daily rolling 3, 5, 7, 10 Years.

Beta summary.

Betas were calculated using rolling:

[1] 21

days. Each iteration of the linear regression is calculated starting

[1] 21

days ahead of the previous one. Betas calculated over different periods of time have fluctuated in the past. They have all converged in recent periods to a range between 1.21 and 1.25. Yahoo Finance reported a 5Y Beta for AAPL of 1.23 on 2021-10-14. Pretty good results!
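One way to picture the rolling calculation is sketched below. It is an assumption about the mechanics, not the original code: the helper `rolling_beta`, the trailing window measured in sessions, and the 252-sessions-per-year convention (12 months of 21 sessions) are all illustrative.

```r
# Hypothetical helper: Beta over a trailing window of `window` sessions,
# recomputed every `step` sessions.
rolling_beta <- function(stock, market, window, step = 21) {
  stopifnot(length(stock) == length(market), window <= length(stock))
  ends <- seq(from = window, to = length(stock), by = step)
  betas <- vapply(ends, function(end) {
    idx <- (end - window + 1):end
    cov(stock[idx], market[idx]) / var(market[idx])
  }, numeric(1))
  data.frame(end = ends, beta = betas)
}

# Synthetic returns with a true Beta of 1.2, for illustration only.
set.seed(42)
market <- rnorm(2000, sd = 1)
stock  <- 1.2 * market + rnorm(2000, sd = 0.5)

res <- rolling_beta(stock, market, window = 3 * 252)  # roughly a 3-year window
head(res)
```

Calling the helper with windows of 5, 7 and 10 years in the same way would give the other columns of a table like the one shown earlier.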

Look ahead: Building a Forecast model.

What is the AAPL closing price likely to be X trading sessions ahead?

Model Structure.

The model takes 3 types of predictors:

1.- time-related,

2.- market-related, and

3.- fundamentals-related.

Each group of predictors aims at explaining different forces which make AAPL shares go up and down in price.

Model Structure (Contd…)

Lag: market and fundamentals predictors will be lagged

[1] 21

periods so that the model can forecast the same number of periods forward in the last stage. Lagging lets past values of a predictor stand in for values that are not yet known. Time variables do not need to be lagged, since the sequential session id, year, month and day can be established precisely going forward.
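A minimal sketch of the lagging step, on toy data. The helper `lag_k` and the column name `SP500_lag` are hypothetical and only illustrate the idea; the original code may do this differently.

```r
# Hypothetical helper: shift a predictor k rows down so that row t holds
# the value observed at session t - k. The first k rows become NA.
lag_k <- function(x, k) {
  stopifnot(k >= 0, k < length(x))
  c(rep(NA, k), head(x, length(x) - k))
}

k  <- 21
df <- data.frame(id = 1:100, SP500 = 1000 + (1:100))  # toy data, not real prices
df$SP500_lag <- lag_k(df$SP500, k)

# Rows 1 to 21 have no lagged value; row 22 carries the value from session 1.
df[c(1, 21, 22, 100), ]
```

With this layout, a row for a future session only needs predictor values that were already observed 21 sessions earlier. Rows with NA in the lagged columns are dropped by lm() under its default NA handling.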

Regression Model: dataset.

## 'data.frame':    8015 obs. of  10 variables:
##  $ id     : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ date   : Date, format: "1990-01-02" "1990-01-03" ...
##  $ Year   : num  1990 1990 1990 1990 1990 1990 1990 1990 1990 1990 ...
##  $ Month  : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ Day    : num  2 3 4 5 8 9 10 11 12 15 ...
##  $ YearSq : num  3960100 3960100 3960100 3960100 3960100 ...
##  $ MonthSq: num  1 1 1 1 1 1 1 1 1 1 ...
##  $ DaySq  : num  4 9 16 25 64 81 100 121 144 225 ...
##  $ SP500  : num  360 359 356 352 354 ...
##  $ SP500Sq: num  129377 128709 126501 124045 125167 ...

Regression Model: dataset (Contd…)

## 'data.frame':    8015 obs. of  10 variables:
##  $ DX.Y.NYBClose: num  94.3 94.4 92.5 92.8 92.1 ...
##  $ NDXClose     : num  228 226 225 223 224 ...
##  $ X.DJIClose   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ X.TNXClose   : num  7.94 7.99 7.98 7.99 8.02 8.02 8.03 8.04 8.1 8.1 ...
##  $ TNX10Ytrans  : num  0.318 0.316 0.316 0.316 0.314 ...
##  $ X.IRXClose   : num  7.58 7.63 7.59 7.54 7.54 7.55 7.5 7.55 7.5 7.5 ...
##  $ TIX5Ytrans   : num  0.334 0.332 0.334 0.336 0.336 ...
##  $ X.TYXClose   : num  8 8.04 8.04 8.06 8.09 8.1 8.11 8.11 8.17 8.17 ...
##  $ TYX30Ytrans  : num  0.315 0.313 0.313 0.313 0.311 ...
##  $ AAPLClose    : num  0.333 0.335 0.336 0.337 0.339 ...

Time-related predictors.

Market-related predictors.

Discount rates-related predictors.

Model Output: ANOVA.

Analysis of Variance Table

Response: AAPLClose
                Df Sum Sq Mean Sq   F value    Pr(>F)    
id               1  44094   44094 22295.518 < 2.2e-16 ***
Year             1     89      89    45.161 1.995e-11 ***
Month            1    445     445   224.981 < 2.2e-16 ***
Day              1   3654    3654  1847.383 < 2.2e-16 ***
MonthSq          1     71      71    35.714 2.426e-09 ***
SP500            1   1980    1980  1000.938 < 2.2e-16 ***
SP500Sq          1  11101   11101  5613.227 < 2.2e-16 ***
DX.Y.NYBClose    1    625     625   316.083 < 2.2e-16 ***
X.TNXClose       1    763     763   385.654 < 2.2e-16 ***
TNX10Ytrans      1   8447    8447  4271.361 < 2.2e-16 ***
X.IRXClose       1   1043    1043   527.540 < 2.2e-16 ***
TIX5Ytrans       1    174     174    87.813 < 2.2e-16 ***
X.TYXClose       1   1955    1955   988.306 < 2.2e-16 ***
TYX30Ytrans      1    650     650   328.585 < 2.2e-16 ***
Residuals     5595  11065       2                        
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Model Output: ANOVA (Contd…)

Call:
   aov(formula = LagLM)

Terms:
                      id     Year    Month      Day  MonthSq    SP500  SP500Sq
Sum of Squares  44093.56    89.31   444.94  3653.54    70.63  1979.54 11101.21
Deg. of Freedom        1        1        1        1        1        1        1
                DX.Y.NYBClose X.TNXClose TNX10Ytrans X.IRXClose TIX5Ytrans
Sum of Squares         625.11     762.70     8447.42    1043.31     173.67
Deg. of Freedom             1          1           1          1          1
                X.TYXClose TYX30Ytrans Residuals
Sum of Squares     1954.56      649.84  11065.16
Deg. of Freedom          1           1      5595

Residual standard error: 1.406303
Estimated effects may be unbalanced

Model Output: Coefficients.

  (Intercept)            id          Year         Month           Day 
 1.812747e+05  3.634371e-01 -9.110017e+01 -7.648393e+00 -2.491013e-01 
      MonthSq         SP500       SP500Sq DX.Y.NYBClose    X.TNXClose 
-3.202201e-03 -2.240315e-02  1.179981e-05 -5.875717e-02  1.095126e+01 
  TNX10Ytrans    X.IRXClose    TIX5Ytrans    X.TYXClose   TYX30Ytrans 
 2.101148e+02 -4.447771e-01 -5.809950e+00 -7.543895e+00 -1.730121e+02 

The Adjusted R-squared value for the model is:

[1] 0.8712447

which means that 87% of the variance is explained by the model, and suggests good fit.

Model Output: Coefficients (Contd…)

And its F-statistic is:

   value    numdf    dendf 
2712.019   14.000 5595.000 

with its corresponding p-value:

value 
    0 

This means that the overall regression model is significant. All the predictor variables are statistically significant at the 0.05 significance level.

But there is a concern: multicollinearity.

Let’s look at the variance inflation factor (VIF). Do we have multicollinearity?

1st approach: use the VIF function from the “regclass” package. If the VIF is greater than 5, that predictor may be collinear with another.

library(regclass); VIF(LagLM). This library fails to load, so I use the vif() function from the car package instead:

           id          Year         Month           Day       MonthSq 
 1.569258e+06  1.566892e+06  3.123737e+03  2.290945e+01  1.344052e+00 
        SP500       SP500Sq DX.Y.NYBClose    X.TNXClose   TNX10Ytrans 
 1.763730e+02  1.138285e+02  2.494302e+00  2.042149e+03  1.441421e+03 
   X.IRXClose    TIX5Ytrans    X.TYXClose   TYX30Ytrans 
 1.349987e+02  1.471874e+02  2.615983e+03  1.843824e+03 
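For reference, the VIF of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on all the others. The sketch below computes it by hand on toy data; the helper `vif_by_hand` is hypothetical and is not the car or regclass implementation.

```r
# Hypothetical helper: VIF_j = 1 / (1 - R2_j), where R2_j comes from
# regressing predictor j on all the other predictors.
vif_by_hand <- function(X) {
  X <- as.data.frame(X)
  sapply(names(X), function(v) {
    others <- setdiff(names(X), v)
    r2 <- summary(lm(reformulate(others, response = v), data = X))$r.squared
    1 / (1 - r2)
  })
}

# Toy predictors: x2 is nearly a copy of x1, x3 is independent of both.
set.seed(7)
x1 <- rnorm(200)
X  <- data.frame(x1 = x1, x2 = x1 + rnorm(200, sd = 0.1), x3 = rnorm(200))

round(vif_by_hand(X), 1)
```

In the toy data, x1 and x2 come out far above the threshold of 5 while x3 stays close to 1, which is the pattern the rule of thumb is meant to flag.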

Predictors correlations.

2nd approach: look at correlations. We want low correlation amongst predictors.

                        id         Year         Month           Day
id             1.000000000  0.998989925  0.0077156840  1.985499e-03
Year           0.998989925  1.000000000 -0.0370486753 -1.748587e-03
Month          0.007715684 -0.037048675  1.0000000000 -2.605396e-04
Day            0.001985499 -0.001748587 -0.0002605396  1.000000e+00
MonthSq        0.024972918  0.006132963  0.4141369438  2.188639e-02
SP500          0.789552387  0.788946731 -0.0089078591  9.124913e-04
SP500Sq        0.738907239  0.738744947 -0.0166544350  6.338672e-04
DX.Y.NYBClose -0.314863992 -0.314734700 -0.0131790951  1.124367e-03
X.TNXClose    -0.919088299 -0.918266662 -0.0033253147  1.915890e-04
TNX10Ytrans    0.914464034  0.913807145  0.0018578711 -6.132122e-05
X.IRXClose    -0.720296780 -0.720161753  0.0017583367  1.409721e-03
TIX5Ytrans     0.730545283  0.730642890 -0.0058155332 -1.811083e-03
X.TYXClose    -0.937447547 -0.936778220  0.0026347476 -1.093113e-03
TYX30Ytrans    0.939591304  0.939110419 -0.0048986340  1.417055e-03

Predictors correlations (Contd…)

                   MonthSq         SP500       SP500Sq DX.Y.NYBClose
id             0.024972918  0.7895523871  0.7389072390  -0.314863992
Year           0.006132963  0.7889467309  0.7387449467  -0.314734700
Month          0.414136944 -0.0089078591 -0.0166544350  -0.013179095
Day            0.021886394  0.0009124913  0.0006338672   0.001124367
MonthSq        1.000000000  0.0050352353 -0.0027088429  -0.036698219
SP500          0.005035235  1.0000000000  0.9869994605   0.065229450
SP500Sq       -0.002708843  0.9869994605  1.0000000000   0.045148749
DX.Y.NYBClose -0.036698219  0.0652294500  0.0451487493   1.000000000
X.TNXClose    -0.057415666 -0.6726805474 -0.5973264730   0.242118977
TNX10Ytrans    0.057541898  0.6216374433  0.5513162352  -0.298003724
X.IRXClose    -0.026769580 -0.3133921818 -0.2318847190   0.279534573
TIX5Ytrans     0.023094762  0.2956558759  0.2175146625  -0.314183879
X.TYXClose    -0.057703919 -0.7604060524 -0.6911395610   0.196275631
TYX30Ytrans    0.060995455  0.7212262477  0.6564104529  -0.248829897

Predictors correlations (Contd…)

                X.TNXClose   TNX10Ytrans   X.IRXClose   TIX5Ytrans
id            -0.919088299  9.144640e-01 -0.720296780  0.730545283
Year          -0.918266662  9.138071e-01 -0.720161753  0.730642890
Month         -0.003325315  1.857871e-03  0.001758337 -0.005815533
Day            0.000191589 -6.132122e-05  0.001409721 -0.001811083
MonthSq       -0.057415666  5.754190e-02 -0.026769580  0.023094762
SP500         -0.672680547  6.216374e-01 -0.313392182  0.295655876
SP500Sq       -0.597326473  5.513162e-01 -0.231884719  0.217514663
DX.Y.NYBClose  0.242118977 -2.980037e-01  0.279534573 -0.314183879
X.TNXClose     1.000000000 -9.895185e-01  0.835706217 -0.826740921
TNX10Ytrans   -0.989518492  1.000000e+00 -0.844866003  0.851106737
X.IRXClose     0.835706217 -8.448660e-01  1.000000000 -0.989073730
TIX5Ytrans    -0.826740921  8.511067e-01 -0.989073730  1.000000000
X.TYXClose     0.981459267 -9.621210e-01  0.738783624 -0.728055889
TYX30Ytrans   -0.979667685  9.769873e-01 -0.750334478  0.750438633

Predictors correlations (Contd…)

                X.TYXClose  TYX30Ytrans
id            -0.937447547  0.939591304
Year          -0.936778220  0.939110419
Month          0.002634748 -0.004898634
Day           -0.001093113  0.001417055
MonthSq       -0.057703919  0.060995455
SP500         -0.760406052  0.721226248
SP500Sq       -0.691139561  0.656410453
DX.Y.NYBClose  0.196275631 -0.248829897
X.TNXClose     0.981459267 -0.979667685
TNX10Ytrans   -0.962120954  0.976987268
X.IRXClose     0.738783624 -0.750334478
TIX5Ytrans    -0.728055889  0.750438633
X.TYXClose     1.000000000 -0.992767062
TYX30Ytrans   -0.992767062  1.000000000

Predictors correlations (Contd…)

The correlation matrix shows that there is indeed high correlation between predictors, and the possibility of multicollinearity is real. Based on its Adjusted R squared and p-values, I am taking the model as good, and will not fix collinearity in this iteration of the exercise.

Residuals Plot.

AAPL Close Actual & Regression (Training set).

AAPL Close Actual & Regression (Testing set).

Root Mean Squared Error (RMSE).

The RMSE for the training set is:

[1] 1.404

and on the testing set is:

[1] 10.35

A total of

[1] 87.08

percent of the data points fall within the black bands (1 standard error from the estimated regression line).
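A hedged sketch of how these two figures can be computed, on toy numbers. The helpers `rmse` and `pct_within_band` are hypothetical, and using the RMSE as the half-width of the band is an assumption based on the description above.

```r
# Hypothetical helpers; `actual` and `predicted` stand for the observed and
# fitted closing prices on one data set.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

pct_within_band <- function(actual, predicted, band) {
  100 * mean(abs(actual - predicted) <= band)
}

# Toy numbers, for illustration only.
actual    <- c(10, 12, 11, 15, 14)
predicted <- c(11, 12, 10, 12, 14)

err <- rmse(actual, predicted)   # sqrt((1 + 0 + 1 + 9 + 0) / 5)
pct <- pct_within_band(actual, predicted, band = err)
c(rmse = err, pct_within = pct)
```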

Apple Close (Forecast).

     c.1.nfwd. OutOfSample.date AAPLCloseOutofSample
8016         1       2021-10-22             156.1928
8017         2       2021-10-25             156.3612
8018         3       2021-10-26             155.1946
8019         4       2021-10-27             148.0768
8020         5       2021-10-28             148.8331
8021         6       2021-10-29             144.9924
8022         7       2021-11-01             148.9005
8023         8       2021-11-02             144.3647
8024         9       2021-11-03             147.9785
8025        10       2021-11-04             149.3354

Apple Close (Forecast) (Contd…)

     c.1.nfwd. OutOfSample.date AAPLCloseOutofSample
8026        11       2021-11-05             152.3610
8027        12       2021-11-08             151.1685
8028        13       2021-11-09             148.7522
8029        14       2021-11-10             147.8090
8030        15       2021-11-11             148.7072
8031        16       2021-11-12             155.1243
8032        17       2021-11-15             156.9915
8033        18       2021-11-16             157.8728
8034        19       2021-11-17             160.8905
8035        20       2021-11-18             162.6581
8036        21       2021-11-19             163.5685

[1] "Out-of-Sample Set. Trading sessions since  2021-10-21"

Apple Close (Forecast) 21 Sessions.

Risk: Probability of permanent loss.

Buy, hold and sell within a window of ‘x’ days. What is the probability of negative returns?

Hold 1 days. Mean= 0.11 SD= 2.75 Prob of perm loss= 48.4 %.

Hold 5 days. Mean= 0.57 SD= 6.05 Prob of perm loss= 46.25 %.

Hold 10 days. Mean= 1.15 SD= 8.61 Prob of perm loss= 44.69 %.

Hold 20 days. Mean= 2.34 SD= 12.35 Prob of perm loss= 42.49 %.

Hold 50 days. Mean= 6.05 SD= 20.31 Prob of perm loss= 38.29 %.

Hold 100 days. Mean= 12.43 SD= 31.1 Prob of perm loss= 34.47 %.

Hold 200 days. Mean= 26 SD= 48.72 Prob of perm loss= 29.68 %.

Hold 400 days. Mean= 59.45 SD= 93.32 Prob of perm loss= 26.2 %.

Hold 1000 days. Mean= 208.04 SD= 306.1 Prob of perm loss= 24.84 %.

Hold 2000 days. Mean= 833.55 SD= 1075.84 Prob of perm loss= 21.92 %.

Hold 4000 days. Mean= 6977.61 SD= 5358.08 Prob of perm loss= 9.64 %.

Risk goes down with longer hold.

   DaysHold ProbOfLoss
1         1      48.40
2         5      46.25
3        10      44.69
4        20      42.49
5        50      38.29
6       100      34.47
7       200      29.68
8       400      26.20
9      1000      24.84
10     2000      21.92
11     4000       9.64

Risk goes down with longer hold (Contd…)

For the computation of probabilities, the distribution of returns is assumed to be normal. But that is not always true! See the histograms for a better feel of where losses may take place. Holding times up to 20 days (approx. 1 month) carry a 42% to 48% chance of permanent loss. Above 50 days (approx. 2 months) the probability of permanent loss diminishes significantly, to roughly 22% to 30% for holds of 200 to 2000 sessions, and falls below 10% at 4000 sessions.
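Under the normality assumption, the probability of loss is the normal cumulative probability of a return below zero. The sketch below applies that to three of the means and standard deviations listed for the holding periods; the helper `prob_loss` is hypothetical, and because the inputs are rounded, the results may differ slightly from the reported figures.

```r
# Hypothetical helper: probability (in percent) of a negative return
# if returns follow Normal(mean_ret, sd_ret).
prob_loss <- function(mean_ret, sd_ret) {
  100 * pnorm(0, mean = mean_ret, sd = sd_ret)
}

# Means and SDs copied from three of the holding periods reported above.
hold     <- c(1, 200, 4000)
mean_ret <- c(0.11, 26, 6977.61)
sd_ret   <- c(2.75, 48.72, 5358.08)

loss_tbl <- data.frame(DaysHold   = hold,
                       ProbOfLoss = round(prob_loss(mean_ret, sd_ret), 2))
loss_tbl
```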